Abstract

Knowledge structures called Concept 
Clustering Knowledge Graphs (CCKGs) 
are introduced along with a process for 
their construction from a machine read- 
able dictionary. CCKGs contain multi- 
ple concepts interrelated through multi- 
ple semantic relations together forming 
a semantic cluster represented by a con- 
ceptual graph. The knowledge acquisi- 
tion is performed on a children's first dic- 
tionary. The concepts involved are gen- 
eral and typical of a daily life conversa- 
tion. A collection of conceptual clusters 
together can form the basis of a lexi- 
cal knowledge base, where each CCKG 
contains a limited number of highly con- 
nected words giving useful information 
about a particular domain or situation. 

1 Introduction 

When constructing a Lexical Knowledge Base
(LKB) useful for Natural Language Processing,
the source of information from which knowledge
is acquired and the structuring of this information
within the LKB are two key issues. Machine
Readable Dictionaries (MRDs) are a good
source of lexical information and have been shown
to be applicable to the task of LKB construction
(Dolan et al., 1993; Calzolari, 1992; Copestake,
1990; Wilks et al., 1989; Byrd et al., 1987). Often
though, a localist approach is adopted whereby
the words are kept in alphabetical order with
some representation of their definitions in the form
of a template or feature structure. Efforts in finding
connections between words are seen in work on
automatic extraction of semantic relations from
MRDs (Ahlswede and Evens, 1988; Alshawi, 1989;
Montemagni and Vanderwende, 1992). Additionally,
efforts in finding words that are close semantically
are seen in the current interest in statistical
techniques for word clustering, looking at
co-occurrences of words in text corpora or dictionaries
(Church and Hanks, 1989; Wilks et al., 1986;
?; Pereira et al., 1995).

Inspired by research in the areas of semantic
relations, semantic distance, concept clustering,
and using Conceptual Graphs (Sowa, 1984) as our
knowledge representation, we introduce Concept
Clustering Knowledge Graphs (CCKGs). Each
CCKG will start as a Conceptual Graph representation
of a trigger word and will expand following
a search algorithm to incorporate related words
and form a Concept Cluster. The concept cluster
in itself is interesting for tasks such as word
disambiguation, but the CCKG will give more to
that cluster. It will give the relations between the
words, making the graph in some aspects similar
to a script (Schank and Abelson, 1975). However,
a CCKG is generated automatically and does not
rely on primitives but on an unlimited number
of concepts, showing objects, persons, and actions
interacting with each other. This interaction will
be set within a particular domain, and the trigger
word should be a key word of the domain to represent.
If that process were carried out for the whole
dictionary, we would obtain an LKB divided into
multiple clusters of words, each represented by a
CCKG. Then during text processing for example,
a portion of text could be analyzed using the
appropriate CCKG to find implicit relations and
help in understanding the text.

Our source of knowledge is the American Heritage
First Dictionary¹, which contains 1800 entries
and is designed for children of age six to
eight. It is made for young people learning the
structure and the basic vocabulary of their language.
In comparison, an adult's dictionary is
more of a reference tool which assumes knowledge
of a large basic vocabulary, while a learner's
dictionary assumes a limited vocabulary but still
some very sophisticated concepts. Using a children's
dictionary allows us to restrict our vocabulary,
but still work on general knowledge about
day to day concepts and actions.

¹ Copyright © 1994 by Houghton Mifflin Company.
Reproduced by permission from THE AMERICAN
HERITAGE FIRST DICTIONARY.

In the following sections, we first present the 
transformation steps from the definitions into con- 
ceptual graphs, then we elaborate on the integra- 
tion process, and finally, we close with a discus- 
sion. 

2 Transforming Definitions 

Our definitions may contain up to three general
types of information, as shown in the examples in
Figure 1.

description: This contains genus/differentia
information. Such information is frequently
used for noun taxonomy construction (Byrd et
al., 1987; Klavans et al., 1990; Barriere and
Popowich, 1996).

general knowledge or usage: This gives information
useful in daily life, like how to use an
object, what it is made of, what it looks like, etc.

specific example: This presents a typical situation
using the word defined and it involves specific
persons and actions.

Cereal is a kind of food. [description]
Many cereals are made from corn, wheat, or rice. [usage]
Most people eat cereal with milk in a bowl. [usage]

Ash is what is left after something burns. [usage]
It is a soft gray powder. [description]
Ray watched his father clean the ashes out of the fireplace.
[example]

Figure 1: Example of definitions

The information given by the description and
general knowledge will be used to perform the
knowledge integration proposed in section 3. The
specific examples are excluded as they tend to involve
specific concepts not always deeply related
to the word defined.

Our processing of the definitions results in the
construction of a special type of conceptual graph
which we call a temporary graph. The set of relations
used in temporary graphs comes from three
sources. Table 1 shows some examples for each
type.

1. the set of closed class words, ex: of, to, in, and;

2. relations extracted via defining formulas, ex: part-of,
made-of, instrument; defining formulas correspond
to phrasal patterns that occur often
through the dictionary suggesting particular
semantic relations (ex. A is a part of B) (?;
Dolan et al., 1993);

3. the relations that are extracted from the syntactic
structure of a sentence, ex: subject, object,
goal, attribute, modifier.

As some relations are defined using the closed
class words, and many of those words are ambiguous,
the resulting graph will itself be ambiguous.
This is the main reason for calling our graphs
temporary, as we assume a conceptual graph, the
ultimate goal of our translation process, should
contain a restricted set of well-defined and
non-ambiguous semantic relations. For example, by
can be a relation of manner (by chewing), time
(by noon) or place (by the door). By keeping the
preposition itself within the temporary graph,
we delay the ambiguity resolution process until
we have gathered more information, and we may
even avoid the decision process altogether, as the
ambiguity might later be resolved by the integration
process itself.

1. closed class words          temporary graph
   np: np[A], prep[B], np[C]   [A]->(B)->[C]
   apple on the table          [apple]->(on)->[table]

2. defining formulas           temporary graph
   A is used to B              [B]->(instrument)->[A]
   A is a part of B            [A]->(part-of)->[B]
   A is a place where B        [B]->(loc)->[A]

3. syntactic pattern           temporary graph
   s: np[A], vp[B]             [B]->(agent)->[A]
   John eats                   [eat]->(agent)->[John]
   vp: vp[A], inf_vp[B]        [A]->(goal)->[B]
   eat to grow                 [eat]->(goal)->[grow]

Table 1: Examples of relations found in sentences
and their corresponding temporary graphs

3 Knowledge Integration



This section describes how given a trigger word, 
we perform a series of forward and backward 
searches in the dictionary to build a CCKG con- 
taining useful information pertaining to the trig- 
ger word and to closely related words. The pri- 
mary building blocks for the CCKG are the tem- 
porary graphs built from the dictionary definitions 
of those words using our transformation process 
described in the previous section. Those temporary
graphs express similar or related ideas in different
ways and with different levels of detail. As
we will try to put all this information together 
into one large graph, we must first find what in- 
formation the various temporary graphs have in 
common and then join them around this common 
knowledge. 
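As a rough illustration of these building blocks, a temporary graph can be pictured as a set of (source, relation, target) triples. The sketch below is only an assumption about one possible encoding: the regular expressions mirror the three defining formulas of Table 1, and the names `FORMULAS`, `formula_to_triple` and `closed_class_triple` are hypothetical, not part of the system described here.

```python
import re

# Hypothetical defining formulas modelled on Table 1. Each entry pairs a
# phrasal pattern with a builder for a (source, relation, target) triple.
FORMULAS = [
    (re.compile(r"^(?P<a>\w+) is used to (?P<b>\w+)$"),
     lambda a, b: (b, "instrument", a)),
    (re.compile(r"^(?P<a>\w+) is a part of (?P<b>\w+)$"),
     lambda a, b: (a, "part-of", b)),
    (re.compile(r"^(?P<a>\w+) is a place where (?P<b>\w+)$"),
     lambda a, b: (b, "loc", a)),
]


def formula_to_triple(phrase):
    """Return the triple for the first matching defining formula, or None."""
    text = phrase.strip().lower()
    for pattern, build in FORMULAS:
        match = pattern.match(text)
        if match:
            return build(match.group("a"), match.group("b"))
    return None


def closed_class_triple(head, preposition, noun):
    """Keep the preposition itself as the (still ambiguous) relation."""
    return (head, preposition, noun)


pen_triple = formula_to_triple("Pen is used to write")
apple_triple = closed_class_triple("apple", "on", "table")
```

Keeping triples in plain sets makes the later matching and joining steps easy to express as set operations, which is the only reason for this choice.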

To help us build this CCKG and perform our
integration process, we assume two main knowledge
structures are available, a concept hierarchy
and a relation hierarchy, and we assume the
existence of some graph operations. The concept
hierarchy concentrates on nouns and verbs as they
account for three quarters of the dictionary
definitions. It has been constructed automatically
according to the techniques described in (Barriere
and Popowich, 1996). The relation hierarchy was
constructed manually. A rich hierarchical structure
between the set of relations is essential to the
graph matching operations we use for the integration
phase.

As we are using the conceptual graph formalism
to represent our definitions, we can use the graph
matching operations defined in (Sowa, 1984). The
two operations we will need are the Maximal Common
Subgraph algorithm and the Maximal Join
algorithm.

3.1 Maximal Common Subgraph

The maximal common subgraph operation between two
graphs consists of finding a subgraph of the first
graph that is isomorphic to a subgraph of the second
graph. In our case, we cannot often expect to
find two graphs that contain an identical subgraph
with the exact same relations and concepts. Ideas
can be expressed in many ways and we therefore
need a more relaxed matching schema. We describe
a few elements of this "relaxation" process
and illustrate them by an example in Figure 2.

(1) John makes a nice drawing on a piece of paper with the pen.
[make]->(sub)->[John]
      ->(obj)->[drawing]->(att)->[nice]
      ->(on)->[piece]->(of)->[paper]
      ->(with)->[pen]

(2) John uses the big crayon to draw rapidly on the paper.
[draw]->(sub)->[John]
      ->(on)->[paper]
      ->(instrument)->[crayon]
      ->(manner)->[rapidly]

MAXIMAL COMMON SUBGRAPH:
[make (draw)]->(sub)->[John]
             ->(obj)->[drawing]
             ->(on)->[piece]->(of)->[paper]
             ->(instrument)->[label-1]

Relaxation method           graph1           graph2
Semantic distance           pen              crayon
Relation subsumption        with             instrument
Predictable meaning shift   drawing          draw
Relation transitivity       piece of paper   paper

MAXIMAL JOIN:
[make (draw)]->(sub)->[John]
             ->(obj)->[drawing]->(att)->[nice]
             ->(on)->[piece]->(of)->[paper]
             ->(instrument)->[label-1]
             ->(manner)->[rapidly]

Figure 2: Example of "relaxed" maximal common
subgraph and maximal join algorithms

Semantic distance between concepts. In
the maximal common subgraph algorithm proposed
by (Sowa, 1984), two concepts (C1, C2)
could be matched if one subsumed the other in
the concept hierarchy. We can relax that criterion
to match two concepts when a third concept C
which subsumes C1 and C2 has a high enough degree
of informativeness (Resnik, 1995). The concept
hierarchy can be useful in many cases, but it
is generated from the dictionary and might not be
complete enough to find all similar concepts.

In the example of Figure 2, when using the concept
hierarchy to establish the similarity between
pen and crayon, we find that one is a subclass
of tool and the other of wax; both are then subsumed
only by the general concept something. We have
reached the root of the noun tree in the concept
hierarchy and this would give a similarity of 0 based
on the informativeness notion.
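The informativeness test can be sketched as follows, in the spirit of (Resnik, 1995): the informativeness of a concept is the negative log of the relative frequency of everything it subsumes, so the root carries no information. The hierarchy fragment and the counts below are invented for illustration only, and all function names are hypothetical.

```python
import math

# Toy concept hierarchy (child -> parent) and occurrence counts. Both are
# invented for illustration and are not taken from the dictionary.
PARENT = {"pen": "tool", "hammer": "tool", "crayon": "wax",
          "tool": "something", "wax": "something"}
COUNT = {"pen": 4, "hammer": 2, "crayon": 2, "tool": 3, "wax": 1, "something": 8}
ROOT = "something"


def ancestors(concept):
    """Return the concept followed by its subsumers, most specific first."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def subtree_count(concept):
    """Occurrences of a concept plus those of everything it subsumes."""
    return sum(n for c, n in COUNT.items() if concept in ancestors(c))


def informativeness(concept):
    """Negative log of the relative frequency of the concept's subtree."""
    return math.log(subtree_count(ROOT) / subtree_count(concept))


def similarity(c1, c2):
    """Informativeness of the most specific concept subsuming both."""
    common = [c for c in ancestors(c1) if c in ancestors(c2)]
    return informativeness(common[0])


sim_pen_crayon = similarity("pen", "crayon")  # only the root subsumes both
sim_pen_hammer = similarity("pen", "hammer")  # both are subsumed by tool
```

With this toy data, pen and crayon meet only at the root and so get no credit from the hierarchy alone, which is the situation that motivates matching on graphs instead.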

We extend the subsumption notion to the 
graphs. Instead of finding a concept that sub- 
sumes two concepts, we will try finding a common 
subgraph that subsumes the graph representation 
of both concepts. In our example, pen and crayon 
have a common subgraph [write]->(inst)->[]. The 
notion of semantic distance can be seen as the
informativeness of the subsuming graph. The resulting
maximal common subgraph as shown in
Figure 2 contains the concept label-1. This label
is associated with a covert category as presented in
(Barriere and Popowich, 1996). We can update
the concept hierarchy and add this label-1 as a
subclass of something and a superclass of pen and
crayon. It expresses a concept of "writing
instrument".

Relation subsumption. Since we have a re- 
lation hierarchy in addition to our concept hier- 
archy, we can similarly use subsumption to match 
two relations. In Figure 2, with is subsumed by
instrument, and by mapping them, we disambiguate
with, ruling out the other semantic relations it could
express, such as possession or accompaniment. This
is a case where an ambiguous preposition left in
the temporary graph is resolved by the integration 
process. 
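A minimal sketch of relation matching, assuming the relation hierarchy can be flattened into a map from each ambiguous preposition to the semantic relations that subsume it. The entries for with come from the paragraph above and those for by from Section 2; the flat one-level layout and the name `match_relations` are assumptions, since the manually built hierarchy is not reproduced in this paper.

```python
# Assumed fragment of the relation hierarchy: each closed class word is
# listed with the semantic relations that can subsume it.
SUBSUMERS = {
    "with": {"instrument", "possession", "accompaniment"},
    "by": {"manner", "time", "place"},
}


def match_relations(r1, r2):
    """Return the relation kept when r1 and r2 match, or None otherwise.

    Two relations match if they are identical or if one subsumes the
    other; in the latter case the well-defined semantic relation is kept,
    which resolves the ambiguous preposition.
    """
    if r1 == r2:
        return r1
    if r2 in SUBSUMERS.get(r1, ()):
        return r2
    if r1 in SUBSUMERS.get(r2, ()):
        return r1
    return None


resolved = match_relations("with", "instrument")
```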

Predictable meaning shift. A set of lexical
implication rules was developed by (Ostler and
Atkins, 1992) for relating word senses. Based on
them, we are developing a set of graph matching
rules. Figure 2 exemplifies one of them, where
two graphs containing the same word (or morphologically
related words), here draw and drawing, used as
different parts of speech can be related.



Relation transitivity. Some relations, like
part-of, in, and from, can be transitive. For example,
we can map a graph that contains a concept A in
a certain relation to concept B onto another graph
where concept A is in the same relation with a part
or a piece of B, as exemplified in Figure 2. Transitivity
in relations is in itself a challenging area
of study (Cruse, 1986) and we have only begun to
explore it.
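One way to picture this relaxation, under the assumption that graphs are sets of (source, relation, target) triples, is to let a relation "see through" a part or piece link when collecting its possible targets. The set `PART_RELATIONS` and the function `targets` are hypothetical names introduced only for this sketch.

```python
PART_RELATIONS = {"of", "part-of"}  # assumed set of part/piece relations


def targets(graph, source, relation):
    """Concepts reached from source via relation, looking through parts.

    graph is a set of (source, relation, target) triples.
    """
    found = set()
    frontier = [t for s, r, t in graph if s == source and r == relation]
    while frontier:
        node = frontier.pop()
        if node in found:
            continue
        found.add(node)
        # A piece or part of B may stand in for B under a transitive reading.
        frontier.extend(t for s, r, t in graph
                        if s == node and r in PART_RELATIONS)
    return found


# The two "on" branches of Figure 2.
graph1 = {("make", "on", "piece"), ("piece", "of", "paper")}
graph2 = {("draw", "on", "paper")}
shared = targets(graph1, "make", "on") & targets(graph2, "draw", "on")
```

Here the two graphs share paper as a target of on, so the branch [piece]->(of)->[paper] can be matched with [paper].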

3.2 Maximal Join 

The basic operation for the integration of tempo- 
rary graphs is the maximal join operation where a 
union of two graphs is formed around their max- 
imal common subgraph using the most specific 
concepts of each. We just saw how to relax the
maximal common subgraph operation and we will
perform the join around that "relaxed" subgraph.
Figure 2 shows the result of the maximal join.
The join operation allows us to bring new con- 
cepts into a graph by finding relations with ex- 
isting concepts, as well as bringing new relations 
between existing concepts. 
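The join can be sketched on the example of Figure 2, assuming the relaxed common subgraph has already been computed and is handed over as a list of matched triple pairs together with the merged triple to keep. This input format and the name `maximal_join` are assumptions made for the illustration, not the operation as defined by Sowa.

```python
def maximal_join(graph1, graph2, common):
    """Union of two graphs formed around their (relaxed) common subgraph.

    graph1, graph2: sets of (source, relation, target) triples.
    common: list of (triple1, triple2, merged) entries, one per matched
            pair, where merged is the triple kept in the joined graph.
    """
    rename1, rename2 = {}, {}
    for (s1, _, t1), (s2, _, t2), (ms, _, mt) in common:
        rename1.update({s1: ms, t1: mt})
        rename2.update({s2: ms, t2: mt})

    joined = {merged for _, _, merged in common}
    matched1 = {first for first, _, _ in common}
    matched2 = {second for _, second, _ in common}
    # Unmatched triples are carried over, attached to the merged concepts.
    for graph, matched, rename in ((graph1, matched1, rename1),
                                   (graph2, matched2, rename2)):
        for s, r, t in graph - matched:
            joined.add((rename.get(s, s), r, rename.get(t, t)))
    return joined


g1 = {("make", "sub", "John"), ("make", "obj", "drawing"),
      ("drawing", "att", "nice"), ("make", "on", "piece"),
      ("piece", "of", "paper"), ("make", "with", "pen")}
g2 = {("draw", "sub", "John"), ("draw", "on", "paper"),
      ("draw", "instrument", "crayon"), ("draw", "manner", "rapidly")}
common = [
    (("make", "sub", "John"), ("draw", "sub", "John"),
     ("make(draw)", "sub", "John")),
    (("make", "on", "piece"), ("draw", "on", "paper"),
     ("make(draw)", "on", "piece")),
    (("make", "with", "pen"), ("draw", "instrument", "crayon"),
     ("make(draw)", "instrument", "label-1")),
]
joined = maximal_join(g1, g2, common)
```

The only triple that graph (2) contributes beyond the common subgraph is the manner relation, which ends up attached to the merged concept make(draw), as in the figure.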

3.3 Integration process 

Given the concept hierarchy, relation hierarchy 
and graph matching operations, we now describe 
the two major steps required to integrate all the 
temporary graphs into a CCKG. 

TRIGGER PHASE. Start with a central 
word, a keyword for the subject of interest that 
becomes the trigger word. The temporary graph 
built from the trigger word forms the initial 
CCKG. To expand its meaning, we want to look
at the important concepts involved and use their
respective temporary graphs to extend our initial
graph. We deem words in the definition to be
important if they have a large semantic weight.
The semantic weight of a word or its informativeness
can be related to its frequency (Resnik,
1995). Here, we calculate the number of occurrences
of each word within the definitions of nouns
and verbs in our dictionary. The most frequent
word "a" occurs 2600 times among a total of
38000 word occurrences. Only 1% of the words
occur more than 130 times, 5% occur more than
30 times but over 60% occur fewer than 5 times.

Ordering the dictionary words in terms of de- 
creasing number of occurrences, the top 10% of 
these words account for 75% of word occurrences. 
For our current investigation, we propose this
as the division between semantically significant
words and semantically insignificant ones. So a
word from the dictionary is deemed to be semantically
significant if it occurs fewer than 17 times.



Note that constraining the number of semanti- 
cally significant words is important in limiting the 
exploration process for constructing the concept 
cluster, as we shall soon see. 
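The significance test reduces to a frequency cut-off, which can be sketched as below. The function names and the tokenised-definition input format are assumptions; the cut-off of 17 is the one stated above, and the sample counts are those reported for the definition of letter in Figure 3.

```python
from collections import Counter


def occurrence_counts(definitions):
    """Count word occurrences over all definitions (word -> list of tokens)."""
    counts = Counter()
    for tokens in definitions.values():
        counts.update(tokens)
    return counts


def significant_words(words, counts, max_count=17):
    """Keep the words that occur fewer than max_count times."""
    return [w for w in words if counts[w] < max_count]


# Occurrence counts reported in Figure 3 for the definition of "letter".
letter_counts = Counter({"you": 280, "paper": 42, "write": 31, "message": 7})
ssws = significant_words(["you", "paper", "write", "message"], letter_counts)
```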

Trigger forward: Find the semantically significant
words that are part of the CCKG, and join
their respective temporary graphs to the initial
CCKG.

Trigger backward: Find all the words in the
dictionary that use the trigger word in their
definition and join their respective temporary
graphs to the CCKG.

Instead of a single trigger word, we now have 
a cluster of words that are related through the 
CCKG. Those words form the concept cluster. 
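The two trigger steps can be sketched as a pair of dictionary look-ups. This sketch tracks only the membership of the concept cluster and leaves out the joining of temporary graphs; the toy definitions are invented, and `trigger_phase` and `is_significant` are hypothetical names.

```python
def trigger_phase(trigger, definitions, is_significant):
    """Return the concept cluster after trigger forward and backward.

    definitions: word -> list of content words in its definition.
    is_significant: predicate implementing the semantic-weight test.
    """
    cluster = [trigger]
    # Trigger forward: significant words used in the trigger's definition.
    for word in definitions[trigger]:
        if is_significant(word) and word in definitions and word not in cluster:
            cluster.append(word)
    # Trigger backward: words whose own definition uses the trigger word.
    for word, tokens in definitions.items():
        if trigger in tokens and word not in cluster:
            cluster.append(word)
    return cluster


# Invented toy dictionary, for illustration only.
toy_definitions = {
    "letter": ["message", "you", "write", "paper"],
    "message": ["group", "word", "send", "person"],
    "stamp": ["paper", "put", "letter", "mail"],
    "mail": ["send", "letter", "package"],
    "dog": ["animal"],
}
rare_words = {"message", "stamp", "mail", "package"}
cluster = trigger_phase("letter", toy_definitions, rare_words.__contains__)
```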

EXPANSION PHASE. We try finding words 
in the dictionary containing many concepts iden- 
tical to the ones already present in the CCKG but 
perhaps interacting through different relations,
allowing us to create additional links within the set
of concepts present in the CCKG. Our goal is to
create a more interconnected graph rather than
sprouting from a particular concept. For this reason,
we establish a graph matching threshold to
decide whether we will join a new graph to the
CCKG being built. We set this threshold empirically:
the maximal common subgraph between
the CCKG and the new temporary graph must 
contain at least three concepts connected through 
two relations. 

Expansion forward: For each semantically
significant word in the CCKG, not already
part of the concept cluster, find the maximal
common subgraph between its temporary
graph and the CCKG. If matching surpasses
the graph matching threshold, perform integration
(maximal join operation) and add the
word to the concept cluster. Continue forward
until no changes are made.

Expansion backward: Find words in the dictionary
whose definitions contain the semantically
significant words from the concept
cluster. For each possible new word, find
the maximal common subgraph between its
temporary graph and the CCKG. Again, if
matching is over the graph matching threshold,
perform integration and add the word
to the concept cluster. Continue until no
changes are made.
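The forward step and its threshold can be sketched as follows. To stay short, the sketch makes two loud simplifications: the intersection of exact triples stands in for the relaxed maximal common subgraph, and set union stands in for the maximal join. The function names, the `max_steps` parameter and the toy graphs are all assumptions.

```python
from itertools import combinations


def passes_threshold(common):
    """True if common holds three concepts connected through two relations."""
    for (s1, _, t1), (s2, _, t2) in combinations(common, 2):
        # Two triples that share a concept and span three concepts in total.
        if {s1, t1} & {s2, t2} and len({s1, t1, s2, t2}) >= 3:
            return True
    return False


def expand_forward(cckg, cluster, temp_graphs, is_significant, max_steps=3):
    """Join temporary graphs of significant CCKG words that pass the threshold."""
    cckg, cluster = set(cckg), list(cluster)
    for _ in range(max_steps):
        changed = False
        concepts = {c for s, _, t in cckg for c in (s, t)}
        for word in sorted(concepts):
            if (word in cluster or word not in temp_graphs
                    or not is_significant(word)):
                continue
            common = cckg & temp_graphs[word]  # stand-in for the relaxed MCS
            if passes_threshold(common):
                cckg |= temp_graphs[word]      # stand-in for the maximal join
                cluster.append(word)
                changed = True
        if not changed:
            break
    return cckg, cluster


# Invented toy graphs, for illustration only.
start = {("write", "obj", "message"), ("write", "on", "paper"),
         ("send", "obj", "message"), ("send", "through", "mail")}
toy_graphs = {
    "send": {("send", "obj", "message"), ("send", "through", "mail"),
             ("send", "to", "person")},
    "paper": {("write", "on", "paper"), ("paper", "made-of", "wood")},
}
significant = {"send", "message", "mail"}
new_cckg, new_cluster = expand_forward(
    start, ["letter", "message"], toy_graphs, significant.__contains__)
```

The backward step would follow the same loop, with the candidate words drawn from dictionary entries whose definitions mention a significant word of the cluster.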

We can set a limit to the number of steps in the
expansion phase to ensure its termination. However,
in practice, after two or three steps forward
or backward, the maximal common subgraphs between
the new graphs and the CCKG do not exceed
the graph matching threshold and thus are not
added to the cluster, terminating the expansion.

3.4 Example of integration

Figure 3 shows the starting point of an integration
process with the trigger word (TW) letter, its
definition, its temporary graph (TG), the concept
cluster (CC) containing only the trigger word, and
the CCKG being the same as the temporary
graph. Then we show the trigger forward phase.
The number of occurrences (NOccs) of each word
present in the definition of letter is given. Using
the criteria described in the previous section,
only the word message is a semantically significant
word (SSW). We then see the definition of message,
the new concept cluster and the resulting
CCKG.

STARTING POINT:
TW: letter
Def: A letter is a message you write on paper.
TG: same as CCKG
CC: {letter}
CCKG: [write]->(obj)->[message(letter)]
             ->(subj)->[person:you]
             ->(on)->[paper]

TRIGGER FORWARD:
NOccs: you:280, paper:42, write:31, message:7
SSWs: message
Def: A message is a group of words that is sent
     from one person to another.
     Many people send messages through the mail.
CC: {letter, message}
CCKG:
[word:group(message(letter))]
    <-(obj)<-[write]->(sub)->[person:you]
                    ->(on)->[paper]
    <-(obj)<-[send]->(subj)->[person:many]
                   ->(from)->[person:one]
                   ->(to)->[person:another]
                   ->(through)->[mail]

Figure 3: Trigger forward from letter.

The trigger backward phase would incorporate
the temporary graphs for address, mail, post office
and stamp. The expansion forward phase
would further add the temporary graphs for the
semantically significant words {send, package}
during the first step and then would terminate
with the second step, as no more semantically
significant words not yet explored have a maximal
common subgraph with the CCKG that exceeds
the graph matching threshold. The expansion
backward would finally add the TGs for card and
note, again terminating after two steps.

The resulting cluster is: {letter, message, address,
mail, post office, stamp, send, package,
card, note}. The resulting CCKG shows the
interaction between those concepts, which summarizes
general knowledge about how we use those
concepts together in a daily conversation: we go
to the post office to mail letters or packages; we
write letters, notes and cards to send to people
through the mail, etc. Having such clusters and
such knowledge of the relationship between words
as part of our lexical knowledge base can be useful
to understand or even generate a text containing
the concepts involved in the cluster.

4 Discussion

Through this paper, we showed the multiple steps
leading us to the building of Concept Clustering
Knowledge Graphs (CCKGs). Those knowledge
structures are built within the Lexical Knowledge
Base (LKB), integrating multiple parts of the
LKB around a particular concept to form a cluster
and express the multiple relations among the
words in that cluster. The CCKGs could be either
permanent or temporary structures depending on
the application using the LKB. For example, for a
text understanding task, we can build beforehand
the CCKGs corresponding to one or multiple keywords
from the text. Once built, the CCKGs will
help us in our comprehension and disambiguation
of the text.

By using the American Heritage First Dictio- 
nary as our source of lexical information, we were 
able to restrict our vocabulary to result in a 
project of reasonable size, dealing with general 
knowledge about day to day concepts and actions. 
The ideas explored using this dictionary can be
extended to other dictionaries as well, but the task
might become more complex as the definitions in
adults' dictionaries are not as clear and usage oriented.
In fact, an LKB built from a children's
dictionary could be seen as a starting point from
which we could extend our acquisition of knowledge
using text corpora or other dictionaries. Certainly,
if we envisage applications trying to understand
children's stories or help in child education,
a corpus of texts for children would be a good
source of information to extend our LKB.

The graph operations (maximal common sub- 
graph and maximal join) defined on conceptual 
graphs, and adapted here, play an important role 
in our integration process toward a final CCKG. 
Graph matching was also suggested as an alterna- 
tive to taxonomic search when trying to establish 
semantic similarity between concepts. As well, by
putting a threshold on the graph matching process,
we were able to limit the expansion of our
clustering, as we can decide and justify the
incorporation of a new concept into a particular cluster.

Many aspects of the concept clustering and 
knowledge integration processes have already been 
implemented and it will soon be possible to test 
the techniques on different trigger words using different
thresholds to see how they affect the quality
of the clusters.

Clustering is often seen as a statistical opera- 
tion that puts together words "somehow" related. 
Here, we give a meaning to their clustering, we 
find and show the connections between concepts, 
and by doing so, we build more than a cluster of 
words. We build a knowledge graph where the
concepts interact with each other giving impor- 
tant implicit information that will be useful for 
Natural Language Processing tasks. 